Abstract

In this paper we describe our efforts for the TREC Contextual
Suggestion task. Our goal this year is to evaluate the effectiveness
of: (1) a preference crawling method that gathers as much information
as possible about candidate spots from the open web to model users'
interest profiles; (2) an automatic summarization method that leverages
information from multiple resources to generate a description for each
candidate scenic spot; (3) a hybrid recommendation method that combines
a variety of factors into a hybrid recommendation system. Finally, we
evaluate the proposed framework on the TREC 2014 Contextual Suggestion
data set; most of our results are equal to or better than the median,
which indicates the effectiveness of the framework.

Introduction 

In this year's Contextual Suggestion (CS) Track, our main aims are
twofold: (1) to combine a variety of factors crawled from the open web
into a hybrid recommendation system; (2) to explore a new description
generation method which combines multiple aspects of information.
Information recommendation always faces a dilemma: a tension between
generality and individuality. Recommended items need to strike a
compromise between popularity and the user's personalized interest. On
the one hand, a highly popular item is likely to be liked by most
users, but it does not reflect a user's personalized interest. On the
other hand, recommending according to a user's personalized needs
requires data that describes the user's interest accurately. The data
about spots crawled from the open web suffers from sparseness; it can
hardly reflect the personal interest of each user and reflects more of
the spots' popularity.

For this reason, we crawled a variety of indirect information about
scenic spots from the open web, such as attractions, spot ranks and
reviews of spots, and used this information to reflect the quality of
spots. By analyzing user profiles, we obtain each user's interest
preference for each category, and we use the spots in the Examples as
the training dataset to train an SVM classifier for each user's
interest. We then use the classifier to obtain a like or dislike
judgment for each user-spot pair. Finally, we use the information
crawled from websites to reflect the spots' popularity, and the user
interest derived from the profiles to reflect the user's personalized
interest. In the recommendation algorithm module, we combine spot
popularity and personalized user interest into two recommendation
algorithms, which produce our two submitted runs, BJUTa and BJUTb.

Our  Method 
System  Framework 

Figure 1 shows our system framework. It mainly consists of six parts:
(1) Useful information gathering, (2) Examples labeling, (3) Profile
Modeling and Interest classification, (4) Recommendation algorithm,
(5) Description generation, (6) Results generation and checking.
Figure 2 shows the legend of Figure 1.

• The Useful information gathering component crawls everything that we
need to rank the candidate scenic spots.

• The Examples labeling component determines the category of each
scenic spot in the Examples by searching the internet; a small number
of spots are labeled manually.

• The Profile Modeling and Interest classification component consists
of two parts: (1) modeling user profiles; (2) user-spot interest
classification. A statistical method is used to identify each user's
preference for each category of spots. User-spot interest
classification uses the spots in the Examples as training samples to
train an SVM classifier, and uses it to classify spots into two
classes, user like and user dislike.

• The Recommendation algorithm component consists of two parts: (1)
choosing 50 candidate recommendation spots for each user-context pair;
(2) sorting the 50 candidate recommendation spots for each user-context
pair.

• The Description generation component uses multi-resource information
to generate a brief description of each spot automatically. We describe
this part in detail later in this paper.

• The Results generation and checking component puts the recommended
spots and their brief descriptions together, uses the official script
to check the results, and submits the results to TREC.

Useful  Information  Gathering 

The first step in solving the contextual suggestion problem is to
gather useful information. Useful information includes not only the
candidate scenic spots, but also their web sites, the reviews
associated with them, etc. For candidate scenic spots we use the open
web, as it contains more comprehensive information than ClueWeb12
does. We crawl information such as candidate scenic spots, main pages,
categories and reviews of candidates exclusively from the website
http://www.tripadvisor.com/ for all contexts. The crawling is done per
category. All scenic spots are crawled in five categories from
TripAdvisor, i.e., attractions, activities, restaurants, nightlife and
shopping.

Figure 2: Legend of Fig. 1.

To compensate for the sparseness of the spot data, we crawl as much
indirect description as possible as supplementary information for
spots, including the rank on TripAdvisor, the rating on TripAdvisor
and the reviews of spots. For each review we record its content,
number and helpful vote count. We compute the proportion of each
category among the spots in the Examples according to the example
labels, and crawl spots for each context according to that proportion.
For attractions, restaurants and activities we crawl the top 120 spots
per category; for nightlife and shopping we crawl the top 60 per
category. For contexts that do not have 50 spots, we add spots near
the context city as a supplement, in line with the location
requirement. In the end, we ensure that each context has at least 70
spots.

All crawlers are written in Python, and the data is stored
hierarchically. Some data are normalized into class structures and
serialized with Python's cPickle module.
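As a minimal sketch of the hierarchical storage described above, the snippet below nests reviews inside spots, spots inside categories and categories inside a context, then serializes the whole structure. The field names and the nested-dictionary layout are illustrative assumptions, not the authors' schema, and Python 3's `pickle` stands in for the Python 2 `cPickle` module.

```python
import pickle

# Hypothetical layout: context -> category -> list of spot records.
# All names and values below are made up for illustration.
crawl = {
    "Hypothetical City": {
        "attractions": [
            {
                "name": "Hypothetical Museum",
                "rank": 1,
                "rating": 4.5,
                "reviews": [
                    {"content": "Great exhibits.", "rating": 5, "helpful_votes": 12},
                ],
            },
        ],
        "restaurants": [],
    },
}

blob = pickle.dumps(crawl)     # serialize the whole hierarchy to bytes
restored = pickle.loads(blob)  # load it back as the same nested structure
```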

Profile  Modeling  and  Interest  classification 

We need to analyze each user's interest in all categories of spots.
The meaning of the user ratings in the profiles is as follows:


• 4: Strongly interested

• 3: Interested

• 2: Neutral

• 1: Disinterested

• 0: Strongly disinterested

• -1: Website didn't load or no rating given

We take the website rating as the spot rating in the profiles. Ratings
of three and four mean that the user likes the spot, while ratings of
zero and one mean that the user does not like the spot. A rating of
two is neutral. First, in our Examples labeling module we classify
each spot in the Examples, and then we analyze the profiles to obtain
each user's interest in every category. Finally, we obtain the like
and dislike probability of each user for every category; the user
interest statistics are shown in Fig. 3.
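The statistical step above can be sketched as a simple count per category. This is a minimal illustration under stated assumptions: ratings of -1 are skipped, neutral ratings count toward the denominator only, and the function and variable names are hypothetical.

```python
from collections import defaultdict

LIKE = {3, 4}      # ratings treated as "like"
DISLIKE = {0, 1}   # ratings treated as "dislike"


def category_probabilities(ratings, spot_category):
    """Estimate p(like | category) and p(dislike | category) for one user.

    ratings:       dict spot_id -> rating in {-1, 0, 1, 2, 3, 4}
    spot_category: dict spot_id -> category label
    """
    counts = defaultdict(lambda: {"like": 0, "dislike": 0, "total": 0})
    for spot_id, rating in ratings.items():
        if rating == -1:  # website did not load or no rating given
            continue
        c = counts[spot_category[spot_id]]
        c["total"] += 1
        if rating in LIKE:
            c["like"] += 1
        elif rating in DISLIKE:
            c["dislike"] += 1
    return {
        cat: {"like": c["like"] / c["total"],
              "dislike": c["dislike"] / c["total"]}
        for cat, c in counts.items()
    }


# Made-up ratings of one user for six example spots.
ratings = {"e1": 4, "e2": 3, "e3": 1, "e4": 2, "e5": 0, "e6": -1}
spot_category = {"e1": "attractions", "e2": "attractions",
                 "e3": "attractions", "e4": "attractions",
                 "e5": "nightlife", "e6": "nightlife"}
probs = category_probabilities(ratings, spot_category)
```

With these made-up ratings the user likes half of the rated attractions and dislikes the only rated nightlife spot.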

We can see that users show an obvious interest orientation toward
different categories of spots. For example, the users' like
probability for attraction and restaurant spots is higher than for
other categories, and their dislike probability is lower. Therefore,
we can use the probability of user interest for each category as an
important index of recommendation, reflecting the interest of each
user. By analyzing the profiles, we divide the example spots into two
classes, likes and dislikes, for each user. The example spots form the
training set used to train an SVM classifier with an RBF kernel
function for each user. After that, each user has a classifier which
can classify a spot into the like class or the dislike class. We then
use the SVM classifier to classify every candidate spot for each user
and label every user-spot pair with "like" or "dislike". We call this
the "user-spot favorite label". The user-spot favorite label also
reflects each user's personalized interest. The user interest
probability captures a user's interest in a class of scenic spots,
while the user-spot favorite label captures a user's interest in a
single spot. Both reflect the personalized interest of the user in the
recommendation system.
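For reference, the standard form of an SVM with an RBF kernel is given below. Here $x$ denotes the feature vector of a candidate spot, $x_j$ and $y_j \in \{+1, -1\}$ are one user's training spots and their like/dislike labels, $\gamma$ is the kernel width, and $\alpha_j$ and $b$ are the learned parameters. The source does not state which features make up $x$, so that part is left abstract.

```latex
K(x, x') = \exp\left(-\gamma \lVert x - x' \rVert^{2}\right), \qquad
f(x) = \operatorname{sign}\left(\sum_{j} \alpha_j \, y_j \, K(x_j, x) + b\right)
```

A spot is labeled "like" for the user when $f(x) = +1$ and "dislike" otherwise.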

Recommendation  algorithm 

We derive the criteria of the recommendation system from the three
evaluation measures of TREC: P@5, MRR and TBG. First of all, the
recommended spots should satisfy the location requirement of the
context, so we use location as a criterion. Second, we use the rank of
spots on TripAdvisor and the rating in the reviews as indicators of
spot quality; these embody the generality of the recommendation
system. We use the probability of user interest for each category and
the classification label of each user-spot pair to reflect the user's
personalized interest; these embody the individuality of the
recommendation system.

According to the requirements of TREC, we need to recommend 50 spots
for each user-context pair. The whole recommendation algorithm is
therefore divided into two steps: (1) choose 50 spots for each
user-context pair; (2) sort the spots of each user-context pair.

Choose 50 spots for each user-context pair. We choose the 50 candidate
spots according to the following principles, applied in the priority
order of the criteria. First, we choose the spots whose location meets
the geographical requirements of the context. If all spots' locations
conform to the geographical requirements of the context, location is
no longer a criterion of the recommendation system. To safeguard TBG,
the spots that conform to the location requirement are selected first.
Then, we need to determine the number of spots of each category
according to the probability of the user's interests. We determine
this number for each user-context pair by the following formula:

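$$\mathit{number}_i = \mathit{sum}_i \times p(\mathit{like} \mid \mathit{category}_i) \qquad (1)$$

where $i \in \mathit{category}$ = {attraction, activities, restaurant,
shopping, nightlife}, $\mathit{sum}_i$ is the number of spots of
$\mathit{category}_i$ in the context, and
$p(\mathit{like} \mid \mathit{category}_i)$ is the conditional
probability.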

Here, $\mathit{number}_i$ is the upper bound for category $i$. Note
that for a context whose spots do not all fit the location
requirement, we prefer to select the spots whose location fits the
requirements of the context. If this strategy causes the number of
spots of one category to exceed its upper bound, no more spots of this
category are selected. The other categories then shrink their upper
bounds in proportion to each category's upper bound.
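As a worked illustration of Equation (1), the sketch below computes the per-category upper bounds for one user-context pair. The function name and the sample numbers are hypothetical, and rounding down to an integer is an assumption; the text does not say how fractional bounds are handled.

```python
import math


def category_upper_bounds(spot_counts, like_prob):
    """Equation (1): number_i = sum_i * p(like | category_i).

    spot_counts: dict category -> number of crawled spots in the context (sum_i)
    like_prob:   dict category -> p(like | category_i) for one user
    """
    # Rounding down is an assumption made for this sketch.
    return {cat: math.floor(spot_counts[cat] * like_prob.get(cat, 0.0))
            for cat in spot_counts}


# Made-up spot counts and like probabilities for one user-context pair.
spot_counts = {"attractions": 120, "restaurants": 120, "nightlife": 60}
like_prob = {"attractions": 0.75, "restaurants": 0.5, "nightlife": 0.25}
bounds = category_upper_bounds(spot_counts, like_prob)
# bounds == {"attractions": 90, "restaurants": 60, "nightlife": 15}
```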

After determining the number of spots for each category, our algorithm
has two variants: (1) the only-rank algorithm, which uses rank as the
only criterion, sorts the spots of all categories together by rank and
chooses the spots in order; (2) the like-rank algorithm, which uses
rank and the user-spot label (like or dislike) as criteria. It prefers
spots with the like label; among spots that all have the like label,
it chooses the spot with the higher rank. The result of the only-rank
algorithm is BJUTa, while the result of the like-rank algorithm is
BJUTb.
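The two selection variants can be sketched as one function with a switch. This is a minimal illustration, not the authors' code: the record fields (`name`, `category`, `rank`, `liked`) are hypothetical, ranks are assumed to be comparable across categories with 1 as the best, and the location filter is assumed to have been applied beforehand.

```python
def choose_spots(candidates, upper_bounds, k=50, use_like_label=False):
    """Pick up to k spots while respecting per-category upper bounds.

    use_like_label=False sketches the only-rank variant,
    use_like_label=True the like-rank variant.
    """
    if use_like_label:
        # Liked spots first; among equally labeled spots, better rank first.
        key = lambda s: (not s["liked"], s["rank"])
    else:
        key = lambda s: s["rank"]

    chosen, used = [], {}
    for spot in sorted(candidates, key=key):
        cat = spot["category"]
        if used.get(cat, 0) >= upper_bounds.get(cat, 0):
            continue  # this category already reached its upper bound
        chosen.append(spot)
        used[cat] = used.get(cat, 0) + 1
        if len(chosen) == k:
            break
    return chosen


# Four made-up candidate spots.
candidates = [
    {"name": "A", "category": "attractions", "rank": 1, "liked": False},
    {"name": "B", "category": "attractions", "rank": 2, "liked": True},
    {"name": "C", "category": "restaurants", "rank": 1, "liked": True},
    {"name": "D", "category": "restaurants", "rank": 3, "liked": False},
]
bounds = {"attractions": 1, "restaurants": 2}
only_rank = [s["name"] for s in choose_spots(candidates, bounds, k=2)]
like_rank = [s["name"] for s in
             choose_spots(candidates, bounds, k=2, use_like_label=True)]
```

In this toy input the only-rank variant takes the two top-ranked spots, while the like-rank variant skips the disliked spot A in favor of the liked spot B.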

Sort spots of each user-context pair. After choosing the 50
recommended spots, we need to sort these 50 spots to generate the
final recommended list. Here, we use two different algorithms to sort
the 50 spots; the difference between the two is whether the user-spot
label is used as a criterion in the sorting algorithm. Because TREC
uses P@5 as one of the main evaluation measures, and in order to
improve the quality of the first five recommended spots, each sorting
algorithm is divided into two parts: (1) sorting of the first six
spots; (2) sorting of the rest of the spots.

• First six spots sorting

In order to safeguard P@5, we select and sort the first six spots
separately. Of the first six spots, two are selected from the category
with the highest like probability for the user, two from the category
with the lowest dislike probability, and two from the category that
has the largest number of spots among the 50 recommended spots. After
this step, we know the number of spots of each category in the first
six. We sort the spots in each category by rank and select spots
according to that category's count in the first six. This gives the
first six spots of the BJUTa result. If we additionally require all of
these spots to have the user-spot label "like", we get the first six
spots of BJUTb.


Figure 3: User interest statistics (like and dislike probability of
each category).


• Rest of the spots sorting

First of all, we sort the categories by the user's dislike probability
in ascending order. We use 0.3 and 0.8 as two thresholds to filter the
categories. The categories whose dislike probability is less than 0.3
are grouped together and recommended first, while the categories whose
dislike probability is greater than 0.8 are recommended last. For the
categories whose dislike probability lies between 0.3 and 0.8, we
treat the probability as a one-dimensional vector and use the k-means
method to cluster the categories into two clusters. We try each
category vector as an initial cluster center and keep the best initial
cluster center. If only two categories remain, they are treated as one
cluster. In the end, we obtain clusters of categories whose dislike
probabilities are at different levels. Then, we choose the spots in
the cluster whose center has the lower dislike probability, sort these
spots by rank and add them to the recommended list. The cluster whose
center has the higher dislike probability is recommended later.
Following this rule, we obtain the recommended list of the remaining
spots.
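The grouping of categories for the remaining spots can be sketched as follows. The thresholds 0.3 and 0.8 come from the text; everything else is an assumption made for illustration, in particular the reading of "best initial cluster center" as trying every pair of values as initial centers and keeping the split with the lowest squared error. All function names are hypothetical.

```python
from itertools import combinations


def two_means_boundary(values, iters=20):
    """1-D k-means with k=2; returns the largest value of the lower cluster."""
    best_sse, best_boundary = None, None
    for c0, c1 in combinations(sorted(set(values)), 2):
        for _ in range(iters):
            low = [v for v in values if abs(v - c0) <= abs(v - c1)]
            high = [v for v in values if abs(v - c0) > abs(v - c1)]
            c0, c1 = sum(low) / len(low), sum(high) / len(high)
        sse = (sum((v - c0) ** 2 for v in low)
               + sum((v - c1) ** 2 for v in high))
        if best_sse is None or sse < best_sse:
            best_sse, best_boundary = sse, max(low)
    return best_boundary


def order_categories(dislike_prob, low_t=0.3, high_t=0.8):
    """Return groups of categories in the order they are recommended."""
    first = [c for c, p in dislike_prob.items() if p < low_t]
    last = [c for c, p in dislike_prob.items() if p > high_t]
    middle = {c: p for c, p in dislike_prob.items() if low_t <= p <= high_t}

    groups = [first] if first else []
    if len(middle) <= 2 or len(set(middle.values())) < 2:
        if middle:
            groups.append(list(middle))  # too few categories: one cluster
    else:
        b = two_means_boundary(list(middle.values()))
        groups.append([c for c, p in middle.items() if p <= b])
        groups.append([c for c, p in middle.items() if p > b])
    if last:
        groups.append(last)
    return groups


# Made-up dislike probabilities of one user.
dislike_prob = {"attractions": 0.1, "restaurants": 0.35, "activities": 0.4,
                "shopping": 0.7, "nightlife": 0.9}
groups = order_categories(dislike_prob)
```

The spots of each group would then be sorted by rank and appended to the recommended list, group by group.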

Finally, combining the results of (1) and (2), we get the final
submitted results, BJUTa and BJUTb.
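The first-six step described above can be sketched in the same style. The record fields are hypothetical, and two details are assumptions: the six spots are returned category by category, since the text does not specify their relative order, and fewer than six spots are returned when a category cannot fill its quota.

```python
from collections import Counter


def first_six(chosen, like_prob, dislike_prob, require_like=False):
    """Pick the first six spots from the spots already chosen.

    Two slots go to the category with the highest like probability, two to
    the category with the lowest dislike probability, and two to the category
    with the most chosen spots. require_like=True keeps only spots labeled
    "like" (the BJUTb variant).
    """
    sizes = Counter(s["category"] for s in chosen)
    quota = Counter()
    quota[max(sizes, key=lambda c: like_prob[c])] += 2
    quota[min(sizes, key=lambda c: dislike_prob[c])] += 2
    quota[max(sizes, key=sizes.get)] += 2

    picked = []
    for cat, n in quota.items():
        pool = [s for s in chosen
                if s["category"] == cat and (s["liked"] or not require_like)]
        picked.extend(sorted(pool, key=lambda s: s["rank"])[:n])
    return picked


def spot(name, category, rank, liked):
    return {"name": name, "category": category, "rank": rank, "liked": liked}


# Seven made-up chosen spots and one user's category probabilities.
chosen = [
    spot("a1", "attractions", 1, True), spot("a2", "attractions", 2, False),
    spot("a3", "attractions", 3, True), spot("a4", "attractions", 4, True),
    spot("r1", "restaurants", 1, True), spot("r2", "restaurants", 2, True),
    spot("s1", "shopping", 1, False),
]
like_prob = {"attractions": 0.8, "restaurants": 0.6, "shopping": 0.2}
dislike_prob = {"attractions": 0.2, "restaurants": 0.1, "shopping": 0.7}

run_a = [s["name"] for s in first_six(chosen, like_prob, dislike_prob)]
run_b = [s["name"] for s in
         first_six(chosen, like_prob, dislike_prob, require_like=True)]
```

The final list for a run is then these first spots followed by the remaining spots in the order produced by the second step.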

Description  Generation 

The data obtained from the open web by our method consists mainly of
two parts: a small number of brief introductions together with a large
number of consumer reviews, and the text of the spots' websites. This
data contains a lot of disordered information. For a large-scale
evaluation such as TREC, tailoring the automatic description to each
user is not an important factor. Researchers show that language and
logic are the prime factors for a description. That is to say, a lucid
description tends to receive a better evaluation. Therefore, we
generate automatic descriptions by extracting whole sentences and
combining them with other information. The descriptions are built from
the following sources:

• The introductory information on the spot's website

• Reviews that have "helpful votes" or give the scenic spot a good
score

• Some introductory information on TripAdvisor

The brief description of a candidate spot follows this template: the
scenic spot's name + introductory information on TripAdvisor + a
concrete introduction of the spot. The concrete introduction is drawn
from sources at 3 levels.

• Level 1: Description information on the spot's website

• Level 2: Reviews tagged with helpful votes

• Level 3: Reviews whose score given by consumers is better than the
average score

Figure 4: Detailed Results of BJUTa.

Figure 5: Detailed Results of BJUTb.

Within each level, the sentences with the largest number of words are
extracted first. Only after all sentences of a higher level have been
extracted can sentences of a lower level be extracted. Extracted
sentences are added until the brief description reaches 512 bytes. A
data structure similar to a stack is used here. From the top level to
the bottom level, complete sentences are extracted and added to the
automatic description. The introductory information on the spot's
website is located using statistics on the number of words in the text
of each HTML tag and the number of words containing the spot's name.
These methods not only preserve the fluency of the automatic
description, but also let the description carry as much information
about the spot's advantages as possible.
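The level-by-level extraction can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: sentences are split with a naive regular expression, the description stops before the first sentence that would push it past 512 bytes, and all function names and sample texts are hypothetical.

```python
import re

MAX_BYTES = 512


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_description(name, tripadvisor_intro, levels, max_bytes=MAX_BYTES):
    """Assemble name + TripAdvisor intro + sentences taken level by level.

    levels: list of lists of texts, ordered from level 1 (spot website)
            to level 3 (above-average reviews). Within a level, sentences
            with more words are taken first.
    """
    parts = [name, tripadvisor_intro]
    size = len(" ".join(parts).encode("utf-8"))
    for texts in levels:
        sentences = [s for t in texts for s in split_sentences(t)]
        sentences.sort(key=lambda s: len(s.split()), reverse=True)
        for sentence in sentences:
            extra = len(sentence.encode("utf-8")) + 1  # +1 for the space
            if size + extra > max_bytes:
                return " ".join(parts)  # whole sentences only, no truncation
            parts.append(sentence)
            size += extra
    return " ".join(parts)


# Made-up spot with one website text (level 1) and one review (level 2).
levels = [["Small. A very large green space downtown."], ["Loved it!"]]
full = build_description("Hypothetical Park", "A city park.", levels)
short = build_description("Hypothetical Park", "A city park.", levels,
                          max_bytes=40)
```

Because only complete sentences are appended, the result stays readable even when the byte budget cuts the list short.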


Examples  labeling 

The spots in the Examples are provided by TREC. Users have rated their
interest in the example spots. In order to analyze users' interests,
the example spots need to be classified into categories. We use
internet search to classify each spot, as follows: we search
TripAdvisor using the spot name from the Examples as the key, and
return the category of the scenic spot whose name and location
information match the example. The spots that are not found on
TripAdvisor are labeled by hand. There are 100 spots in the Examples,
of which 87 were found by search and 13 were labeled by hand.

Table 1: Overall Mean Performances.

       BJUTa    BJUTb
P@5    0.5057   0.5037
MRR    0.6850   0.6700
TBG    2.1993   2.2003

Submitted  Runs  and  Experiment  Results 

We submitted two runs: BJUTa and BJUTb. BJUTa uses only the spot rank
and the probability of user interest in each category as the basis for
selecting and sorting candidate spots, while BJUTb uses the spot rank,
the probability of user interest in each category and the user
favorite label of each spot. For descriptions, the system combines the
opening sentence, the meta-description of the web site, reviews that
have the helpful vote tag, and sentences from favorable reviews of the
scenic spot. We use 10-fold cross-validation to select the parameters
of the RBF kernel function in the SVM training process.

As for judgments, 304 profile/context pairs were sampled and judged.
There are three criteria: geographic appropriateness, interestingness
of the website, and interestingness of the description. Geographic
appropriateness has 3 levels: 0 (not geographically appropriate), 1
(marginally geographically appropriate) and 2 (geographically
appropriate). Interestingness of the website and of the description
has 5 levels: 0 (strongly uninterested), 1 (uninterested), 2
(neutral), 3 (interested) and 4 (strongly interested). Moreover,
geographic appropriateness was judged partially by NIST assessors and
partially by users. If a profile/context pair was judged by both, the
scores from NIST assessors were used.

Three evaluation measures were reported: P@5 (precision at 5), MRR
(mean reciprocal rank) and TBG (time-biased gain). For P@5 and MRR, a
suggestion is labeled as relevant only if it has a geographic score of
1 or 2, a description score of 3 or 4, and a web site score of 3 or 4.
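The relevance rule and the two rank-based measures can be written down directly. The sketch below is not the official TREC evaluation script; the function names and the sample judgments are hypothetical, and TBG is left out because its formula is not given here.

```python
def is_relevant(geo, description, website):
    """Relevance rule used for P@5 and MRR."""
    return geo in (1, 2) and description in (3, 4) and website in (3, 4)


def precision_at_5(judgments):
    """judgments: ranked list of (geo, description, website) scores."""
    return sum(is_relevant(*j) for j in judgments[:5]) / 5


def reciprocal_rank(judgments):
    """1 / rank of the first relevant suggestion, or 0 if there is none."""
    for position, j in enumerate(judgments, start=1):
        if is_relevant(*j):
            return 1 / position
    return 0.0


# Made-up judgments for the top five suggestions of one profile/context pair.
ranked = [(2, 2, 4), (1, 3, 3), (2, 4, 4), (0, 4, 4), (2, 3, 4)]
p5 = precision_at_5(ranked)   # suggestions 2, 3 and 5 are relevant
rr = reciprocal_rank(ranked)  # first relevant suggestion is at rank 2
```

MRR is the mean of `reciprocal_rank` over all judged profile/context pairs, and the reported P@5 is the corresponding mean of `precision_at_5`.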

Figure 4 and Figure 5 show the performance of our two runs in terms of
all evaluation measures. The X axis consists of all profile/context
pairs, ordered alphabetically in the format "profile-context". Each
red point represents our result for that profile/context pair. The
three bars correspond to the best, median and worst results for that
profile/context pair. The yellow line marks the range that is above
the average. We can see that our runs achieve some of the best
results. Most of our results are equal to or better than the median
results, indicating the effectiveness of our proposed system. Table 1
shows the overall mean performance of our runs in terms of all
evaluation measures.

Conclusion 

In the TREC 2014 Contextual Suggestion Track, we submitted two runs.
Both use indirect description information of candidate spots and user
interest information to select and sort the candidate spots. The
indirect description information of candidate spots includes the
spot's category, location and rank. The user interest information
includes the probability of user interest in each category and the
user favorite label of each spot. We build our recommendation
algorithms on these indicators. The spot category, location
information, rank and probability of user interest in each category
are used to produce BJUTa, while BJUTb additionally uses the user
favorite label of each spot. Due to the sparseness of open-web data,
our recommendation algorithm does not depend on the similarity between
two spots; instead, it uses a variety of indirect descriptions of
scenic spots from the open web, which reflect the quality of spots,
together with user profiles, which reflect user interest, to select
and sort the candidate spots. We use a variety of information on the
open web with a whole-sentence extraction method to generate brief
spot descriptions automatically.

The performance of our two submitted runs is in general better than
the median performance. Some of the results are even the best results,
indicating the effectiveness of our proposed method.